
[LoRA] Adds support for bias in LoRA #5733

Open · followumesh wants to merge 44 commits into main

Conversation

@followumesh (Author)

Motivation
PEFT and https://github.com/foundation-model-stack/fms-hf-tuning include support for tuning LoRA bias. This PR adds bias support for LoRA so that adapters trained with bias work with vLLM.
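
For context, a rough sketch of how such an adapter might be produced on the training side with PEFT; the model name, target modules, and output path are placeholder assumptions, not taken from this PR:

    # Illustrative training-side config that yields a LoRA adapter with bias terms.
    # Model name, target modules, and output path are placeholders.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],
        bias="all",  # also train bias terms; these are what this PR lets vLLM load
    )
    model = get_peft_model(base, lora_config)
    model.save_pretrained("lora-with-bias")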

Changes Included

  • LoRA bias support for different types of modules.
  • LoRA bias support for fully sharded LoRA.
  • Test file test-lora-bias.py

@Yard1 (Collaborator) left a comment

Could we add an engine argument enable_lora_bias and avoid initializing the bias tensors when it's false? If the user knows none of their LoRAs will have bias, we can save memory.
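
A minimal sketch of the idea being requested; the function and argument names below are illustrative assumptions, not the PR's actual code:

    from typing import Optional
    import torch

    def maybe_create_bias_buffer(enable_lora_bias: bool, max_loras: int,
                                 out_features: int,
                                 device: str = "cpu") -> Optional[torch.Tensor]:
        # Only allocate the stacked bias buffer when the engine-level flag is on;
        # otherwise return None so no memory is spent on unused bias slots.
        if not enable_lora_bias:
            return None
        return torch.zeros(max_loras, out_features,
                           dtype=torch.float16, device=device)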

@followumesh (Author)

@Yard1 Thanks for reviewing the PR. I have added the enable_lora_bias flag (default false), which prevents allocation of the LoRA bias tensors when it is disabled.
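
For illustration, a sketch of how the new flag would be used from the offline API, assuming enable_lora_bias is exposed as an engine keyword alongside the other LoRA flags; the model name and adapter path are placeholders:

    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    # enable_lora_bias is the engine flag discussed in this PR; paths are placeholders.
    llm = LLM(model="meta-llama/Llama-2-7b-hf",
              enable_lora=True,
              enable_lora_bias=True)

    outputs = llm.generate(
        ["Hello, my name is"],
        SamplingParams(max_tokens=16),
        lora_request=LoRARequest("bias-adapter", 1, "/path/to/lora-with-bias"),
    )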

@njhill (Member) commented Jun 27, 2024

Related: #5930

@Yard1 (Collaborator) left a comment

Looks good, can we also add an e2e test?

@DarkLight1337 (Member)

To speed up the CI queue for #5905, I've cancelled the distributed tests for the latest CI run in this PR since they won't pass anyway until #5905 has been merged. Please merge main into your branch after that happens so that the CI can pass once again.

@followumesh (Author)

@Yard1 Thanks for reviewing. I've added an e2e test for the lora_bias support.

@njhill (Member) commented Jul 29, 2024

@followumesh you need to run ./format.sh to fix the linting errors

@njhill (Member) commented Aug 1, 2024

@followumesh apologies, this needs another conflict resolution!

@followumesh (Author)

@Yard1 @njhill I have included the e2e test and merged the recent changes. Can you please review the commit? Thanks

vllm/lora/utils.py (outdated review thread, resolved)
@njhill (Member) left a comment

Thanks for all of the work on this @followumesh!

Resolved (outdated) review threads: vllm/lora/utils.py, vllm/lora/models.py, vllm/lora/layers.py (three threads).
    ):
        self.reset_lora(index)

        if self.tp_size > 1:
            lora_a = self.slice_lora_a(lora_a)
            lora_b = self.slice_lora_b(lora_b)
            if bias is not None:
                bias = self.slice_bias(bias)
Member:

The typing looks wrong here ... bias is a tensor but the method expects a list

Author:

Here the slice_bias mimics slice_lora_b.

Member:

Hmm OK fair enough, I guess the typing errors are preexisting.
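
For readers following the thread, a rough standalone sketch of what mimicking slice_lora_b means under tensor parallelism; the helper name and arguments are assumptions for illustration, not the PR's actual method:

    import torch

    def slice_bias_for_shard(bias: torch.Tensor, tp_rank: int,
                             tp_size: int) -> torch.Tensor:
        # Like slice_lora_b, keep only this tensor-parallel rank's slice of the
        # output dimension, so each shard applies the bias for the columns it owns.
        shard_size = bias.shape[-1] // tp_size
        start = tp_rank * shard_size
        return bias[..., start:start + shard_size]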

Comment on lines +698 to +699
self, bias: List[Union[torch.Tensor,
None]]) -> List[Union[torch.Tensor, None]]:
Member:

It might be better to use a tuple here instead of a list. That is clearer when there is a fixed number of elements, and performance is generally better.

Author:

Same as above, I have let it mimic slice_lora_b.
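
For the packed case discussed in this thread, a sketch of the reviewer's tuple suggestion; the function name and the per-sub-LoRA slicing are illustrative assumptions built on the sketch above:

    from typing import Optional, Tuple
    import torch

    def slice_bias_packed(
        biases: Tuple[Optional[torch.Tensor], ...],  # one entry per packed sub-LoRA
        tp_rank: int,
        tp_size: int,
    ) -> Tuple[Optional[torch.Tensor], ...]:
        # A tuple rather than a list: the number of packed sub-LoRAs is fixed
        # (for example q/k/v), which is the clarity argument made above.
        sliced = []
        for b in biases:
            if b is None:
                sliced.append(None)
            else:
                shard_size = b.shape[-1] // tp_size
                start = tp_rank * shard_size
                sliced.append(b[..., start:start + shard_size])
        return tuple(sliced)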

Resolved review threads: vllm/engine/arg_utils.py, vllm/lora/models.py (outdated).
@followumesh (Author)

@njhill I have addressed your comments above. Can you please review this again? Thanks

@njhill (Member) left a comment

Thanks @followumesh and sorry for the delay.

There's one remaining small but important thing to fix (and tests are failing because of it).

Comment on lines 396 to 400
        if not self.lora_config.bias_enabled:
            module_lora.bias = None
            raise ValueError(
                f"Adapter bias cannot be used for {module_name}"
                " without --enable-lora-bias.")
Member:

This doesn't look right and is causing blanket LoRA failures. I think it should be:

Suggested change
-        if not self.lora_config.bias_enabled:
-            module_lora.bias = None
-            raise ValueError(
-                f"Adapter bias cannot be used for {module_name}"
-                " without --enable-lora-bias.")
+        if module_lora.bias is not None and not self.lora_config.bias_enabled:
+            raise ValueError(
+                f"Adapter bias cannot be used for {module_name}"
+                " without --enable-lora-bias.")

Author:

Incorporated the comment.
